Artificial intelligence (AI) models are increasingly used to synthesize medical research, providing rapid summaries of complex topics such as hypertension management. This paper compares the outputs of three AI systems (Grok by xAI, ChatGPT by OpenAI, and Perplexity) on the efficacy (blood pressure [BP] reductions) and safety (side effects) of five antihypertensive drug classes: beta blockers (BBs), angiotensin receptor blockers (ARBs), sodium-glucose cotransporter 2 (SGLT2) inhibitors, diuretics, and calcium channel blockers (CCBs). The original query asked which classes have the fewest side effects and the highest efficacy, and requested medication rankings within each class.
The comparison evaluates consistency, accuracy against peer-reviewed meta-analyses, and structural adherence to research formats (e.g., IMRaD). Data from AI outputs were qualitatively and quantitatively analyzed for alignment with evidence-based medicine. This analysis highlights strengths in synthesis (e.g., table usage) and potential biases or variances in cited data.
| Aspect | Grok | ChatGPT | Perplexity |
|---|---|---|---|
| Class with Least Side Effects | ARBs (low withdrawal rates, minimal metabolic impact) | ARBs and CCBs (favorable tolerability, low serious effects) | ARBs (most tolerable), followed by CCBs |
| Class with Highest Efficacy (BP Reduction) | Diuretics and CCBs (9-11 mmHg SBP) | Diuretics (~12 mmHg SBP in some data), but overall similar across classes (~8-9 mmHg) | CCBs (10-15 mmHg SBP), followed by Diuretics (9-12 mmHg) |
| Average SBP/DBP Reductions (mmHg) | BBs: 8-9/5-6; ARBs: 8-10/4-5; SGLT2: 3-4/1-2; Diuretics: 9-11/5-6; CCBs: 9-10/5-6 | BBs: ~8.7/6; ARBs: ~8.7/similar; SGLT2: 3.6-5.1/1.7-2.7; Diuretics: ~12/7; CCBs: 8-10/6 | BBs: 10/8; ARBs: 8-13/5-8; SGLT2: 3.5-5/1.7-1.9; Diuretics: 9-12/4-5; CCBs: 10-15/5-8 |
| Common Side Effects Summary | BBs: Fatigue, bradycardia; ARBs: Dizziness, hyperkalemia; SGLT2: Infections, dehydration; Diuretics: Electrolyte issues, gout; CCBs: Edema, headache | BBs: Fatigue, dizziness; ARBs: Dizziness, headache; SGLT2: Polyuria, infections; Diuretics: Electrolytes, gout; CCBs: Edema, flushing | BBs: Bradycardia, fatigue; ARBs: Dizziness, headache; SGLT2: Infections, ketoacidosis; Diuretics: Hypokalemia, gout; CCBs: Edema, flushing |
| Medication Rankings (Top 1-2 per Class) | BBs: Metoprolol, Atenolol; ARBs: Losartan, Valsartan; SGLT2: Empagliflozin, Dapagliflozin; Diuretics: Chlorthalidone, HCTZ; CCBs: Amlodipine, Nifedipine | Not explicitly ranked; notes on agents like amlodipine (CCBs) | BBs: Metoprolol, Bisoprolol; ARBs: Valsartan, Losartan; SGLT2: Empagliflozin, Dapagliflozin; Diuretics: Chlorthalidone, Indapamide; CCBs: Amlodipine, Nifedipine |
| Reference Quality | 15 peer-reviewed citations (e.g., Cochrane, Lancet, AHA/ACC 2025); focused on recent meta-analyses | 11 citations (e.g., PubMed meta-analyses, Cochrane); includes network meta-analyses | 28 citations/links (e.g., NCBI, PMC, JAMA); mix of reviews and guidelines |
This review involved a qualitative comparison of the provided AI outputs, which were generated in response to a standardized query on antihypertensive comparisons. Outputs were parsed for key elements: conclusions on side effects and efficacy, BP reduction data, side effect lists, medication rankings, and references. Discrepancies were cross-referenced against peer-reviewed meta-analyses obtained via web searches (e.g., PubMed, BMC Medicine) using queries like "meta-analysis blood pressure lowering efficacy antihypertensive drug classes" and "SGLT2 inhibitors blood pressure reduction meta-analysis hypertension." Inclusion criteria: publications from 2017-2025 reporting RCTs or meta-analyses with quantitative BP data and side effect profiles. No primary data were collected; the synthesis emphasized alignment with the published evidence.
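The inclusion criteria above can be written as a simple screening filter. The following is a minimal sketch, not the procedure actually used: the `Record` fields, the `is_eligible` helper, and the placeholder records are all hypothetical, and it assumes that both quantitative BP data and a side effect profile are required.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """Hypothetical screening record; the field names are assumptions."""
    label: str
    year: int
    design: str            # e.g. "RCT", "meta-analysis", "narrative review"
    has_bp_data: bool      # reports quantitative BP reductions
    has_side_effects: bool # reports a side effect profile

ELIGIBLE_DESIGNS = {"rct", "meta-analysis"}

def is_eligible(record, first_year=2017, last_year=2025):
    """Apply the stated criteria: publication window, design, and data content."""
    return (
        first_year <= record.year <= last_year
        and record.design.lower() in ELIGIBLE_DESIGNS
        and record.has_bp_data
        and record.has_side_effects
    )

# Placeholder records for illustration only; these are not real publications.
candidates = [
    Record("placeholder A", 2019, "meta-analysis", True, True),
    Record("placeholder B", 2015, "RCT", True, True),               # outside window
    Record("placeholder C", 2022, "narrative review", True, True),  # wrong design
    Record("placeholder D", 2024, "RCT", True, False),              # no side effect profile
]

included = [r.label for r in candidates if is_eligible(r)]
print(included)
```

Keeping the criteria in one predicate makes it easy to rerun the screen if the publication window or the accepted designs change.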
The table above summarizes the AI outputs. Consistency was high on two points: all three AIs named ARBs as a class with the fewest side effects (ChatGPT named CCBs alongside them), and all three reported the most modest BP reductions for SGLT2 inhibitors (3-5 mmHg SBP across outputs). Efficacy rankings varied slightly: Grok and Perplexity favored diuretics/CCBs for the largest BP drops (9-15 mmHg SBP), while ChatGPT singled out diuretics but emphasized overall similarity across classes (~8-9 mmHg). Side effect lists were similar, with BBs linked to fatigue/bradycardia, ARBs to dizziness, SGLT2 inhibitors to infections, diuretics to electrolyte disturbances/gout, and CCBs to edema/flushing.
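The variation in reported SBP reductions can be made concrete by reducing each table cell to a single number and measuring how far the three outputs diverge per class. The sketch below uses the SBP values transcribed from the table; taking the midpoint of each reported range is an assumption made here for illustration, and `midpoint` and `spread_by_class` are hypothetical helper names.

```python
import re
from statistics import mean

# SBP reductions (mmHg) transcribed from the comparison table.
SBP_TEXT = {
    "Grok": {"BBs": "8-9", "ARBs": "8-10", "SGLT2": "3-4",
             "Diuretics": "9-11", "CCBs": "9-10"},
    "ChatGPT": {"BBs": "~8.7", "ARBs": "~8.7", "SGLT2": "3.6-5.1",
                "Diuretics": "~12", "CCBs": "8-10"},
    "Perplexity": {"BBs": "10", "ARBs": "8-13", "SGLT2": "3.5-5",
                   "Diuretics": "9-12", "CCBs": "10-15"},
}

def midpoint(text):
    """Average the numbers in a cell such as '8-10' or '~8.7' (assumed summary)."""
    numbers = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", text)]
    return mean(numbers)

def spread_by_class(table):
    """Largest minus smallest midpoint across AI systems, per drug class."""
    classes = next(iter(table.values())).keys()
    spread = {}
    for drug_class in classes:
        mids = [midpoint(table[system][drug_class]) for system in table]
        spread[drug_class] = round(max(mids) - min(mids), 2)
    return spread

spread = spread_by_class(SBP_TEXT)
for drug_class, value in sorted(spread.items(), key=lambda kv: kv[1]):
    print(f"{drug_class}: {value} mmHg")
```

Under the midpoint assumption, the three outputs agree most closely on SGLT2 inhibitors (0.85 mmHg between midpoints) and diverge most on CCBs (3.5 mmHg), which matches the qualitative reading of the table.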
Medication rankings overlapped where they were given (e.g., metoprolol for BBs, losartan/valsartan for ARBs, empagliflozin/dapagliflozin for SGLT2 inhibitors, chlorthalidone for diuretics, amlodipine for CCBs); ChatGPT did not rank agents explicitly. Reference counts were 15 for Grok, 11 for ChatGPT, and 28 (with links) for Perplexity. All three adhered to IMRaD with tables near the top, though only Perplexity explicitly included an abstract and conclusion.
Cross-validation against meta-analyses: the average monotherapy reduction in RCTs is ~8.7 mmHg SBP[1], and SGLT2 inhibitors lower ambulatory BP by ~3.6/1.7 mmHg[5]. Cardiovascular (CV) outcomes favor ARBs/CCBs over BBs[2], with CCBs/diuretics best for stroke prevention[3].
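As an illustration of this cross-validation step, the sketch below checks whether each AI's reported SGLT2 range brackets the meta-analytic figure of ~3.6/1.7 mmHg quoted above. The ranges come from the comparison table; the `contains` helper is hypothetical, and comparing an ambulatory benchmark with the AI-reported values (whose measurement setting is not stated) is an assumption.

```python
# Meta-analytic SGLT2 benchmark quoted in the text (ambulatory, mmHg).
BENCHMARK = {"SBP": 3.6, "DBP": 1.7}

# SGLT2 ranges (low, high) in mmHg, transcribed from the comparison table.
SGLT2_REPORTED = {
    "Grok": {"SBP": (3.0, 4.0), "DBP": (1.0, 2.0)},
    "ChatGPT": {"SBP": (3.6, 5.1), "DBP": (1.7, 2.7)},
    "Perplexity": {"SBP": (3.5, 5.0), "DBP": (1.7, 1.9)},
}

def contains(value_range, benchmark):
    """True when the benchmark lies inside the reported range (bounds inclusive)."""
    low, high = value_range
    return low <= benchmark <= high

# For each AI system, require both the SBP and the DBP range to bracket the benchmark.
agreement = {
    system: all(contains(ranges[k], BENCHMARK[k]) for k in BENCHMARK)
    for system, ranges in SGLT2_REPORTED.items()
}

for system, ok in agreement.items():
    print(f"{system}: {'brackets' if ok else 'misses'} the SGLT2 benchmark")
```

The same check could be applied to other classes once a class-specific benchmark is fixed; the overall ~8.7 mmHg average is not used here because it is not a per-class estimate.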
The AI outputs demonstrate strong concordance, reflecting shared access to medical literature. All correctly identify the tolerability advantages of ARBs (low metabolic impact, rare serious effects) and the modest efficacy of SGLT2 inhibitors, consistent with meta-analyses showing ~3-4 mmHg SBP reductions and infection risks[4][5]. The main discrepancies are in BP estimates: ChatGPT's higher diuretic efficacy (~12 mmHg) may draw on older data, while Grok and Perplexity align better with recent averages (~9-11 mmHg)[1]. Grok provided the most structured rankings, Perplexity the most references (though some, such as GoodRx, are not peer-reviewed), and ChatGPT the most balanced narrative.
Strengths: All used tables effectively, cited peer-reviewed sources, and noted limitations (e.g., heterogeneity). Weaknesses: Varied specificity (e.g., no uniform DBP data), potential overestimation (Perplexity's 10-15 mmHg for CCBs exceeds meta-analytic averages). Future AI syntheses could standardize against benchmarks like the 2025 Lancet meta-analysis[1]. Clinically, these outputs support ARBs for tolerability and diuretics/CCBs for efficacy, but human oversight is essential for patient-specific decisions.